Knowledge Discovery Via Multiple Models

Author

  • Pedro M. Domingos
Abstract

If it is to qualify as knowledge, a learner's output should be accurate, stable and comprehensible. Learning multiple models can improve significantly on the accuracy and stability of single models, but at the cost of losing their comprehensibility (when they possess it, as do, for example, simple decision trees and rule sets). This article proposes and evaluates CMM, a meta-learner that seeks to retain most of the accuracy gains of multiple-model approaches, while still producing a single comprehensible model. CMM is based on reapplying the base learner to recover the frontiers implicit in the multiple-model ensemble. This is done by giving the base learner a new training set, composed of a large number of examples generated and classified according to the ensemble, plus the original examples. CMM is evaluated using C4.5RULES as the base learner, and bagging as the multiple-model methodology. On 26 benchmark datasets, CMM retains on average 60% of the accuracy gains obtained by bagging relative to a single run of C4.5RULES, while producing a rule set whose complexity is typically a small multiple (2-6) of C4.5RULES's, and also improving stability. Further studies show that accuracy and complexity can be traded off by varying the number of artificial examples generated.
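The procedure the abstract describes (bag the base learner, generate artificial examples, label them with the ensemble, then rerun the base learner on the original plus artificial examples) can be sketched roughly as follows. This is a minimal illustration under several assumptions, not the author's implementation: a tiny hand-written decision tree stands in for C4.5RULES, artificial examples are drawn uniformly from the bounding box of the training data (the abstract does not say how they are generated), and the names `fit_tree`, `bagging`, `cmm`, and all parameter values are hypothetical.

```python
import random
from collections import Counter

def majority(labels):
    # Most frequent label; ties resolved by first occurrence.
    return Counter(labels).most_common(1)[0][0]

def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def fit_tree(X, y, depth=3):
    """Tiny axis-parallel tree; an assumed stand-in for the base learner."""
    if depth == 0 or len(set(y)) == 1:
        return majority(y)
    best = None
    for j in range(len(X[0])):
        # Skipping the smallest value keeps both sides of the split non-empty.
        for t in sorted(set(row[j] for row in X))[1:]:
            left = [i for i, row in enumerate(X) if row[j] < t]
            right = [i for i, row in enumerate(X) if row[j] >= t]
            score = (len(left) * gini([y[i] for i in left])
                     + len(right) * gini([y[i] for i in right]))
            if best is None or score < best[0]:
                best = (score, j, t, left, right)
    if best is None:  # no feature has two distinct values
        return majority(y)
    _, j, t, left, right = best
    return (j, t,
            fit_tree([X[i] for i in left], [y[i] for i in left], depth - 1),
            fit_tree([X[i] for i in right], [y[i] for i in right], depth - 1))

def predict(tree, x):
    # Internal nodes are tuples (feature, threshold, low_branch, high_branch).
    while isinstance(tree, tuple):
        j, t, low, high = tree
        tree = low if x[j] < t else high
    return tree

def bagging(X, y, n_models, rng, depth=3):
    # Train each model on a bootstrap sample of the original data.
    n = len(X)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_tree([X[i] for i in idx], [y[i] for i in idx], depth))
    return models

def ensemble_predict(models, x):
    return majority([predict(m, x) for m in models])

def cmm(X, y, n_models=10, n_artificial=500, seed=0, depth=3):
    rng = random.Random(seed)
    models = bagging(X, y, n_models, rng, depth)
    # Assumption: sample artificial examples uniformly within the data's range.
    lo = [min(col) for col in zip(*X)]
    hi = [max(col) for col in zip(*X)]
    X_art = [[rng.uniform(a, b) for a, b in zip(lo, hi)]
             for _ in range(n_artificial)]
    # The ensemble, not the true target, labels the artificial examples.
    y_art = [ensemble_predict(models, x) for x in X_art]
    # Reapply the base learner to original + artificial examples.
    return fit_tree(X + X_art, y + y_art, depth), models

# Toy usage on synthetic two-feature data (illustrative only).
data_rng = random.Random(1)
X = [[data_rng.random(), data_rng.random()] for _ in range(60)]
y = [int(a + b > 1.0) for a, b in X]
single_model, ensemble = cmm(X, y, n_models=5, n_artificial=200)
print(predict(single_model, [0.9, 0.9]), ensemble_predict(ensemble, [0.9, 0.9]))
```

In this sketch `n_artificial` plays the role of the knob mentioned at the end of the abstract: more artificial examples give the single model more evidence about the ensemble's decision frontier, at the likely cost of a larger model.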

Related resources

Designing an Ontology for Knowledge Discovery in Iran’s Vaccine

Ontology is a requirement engineering product and the key to knowledge discovery. It includes the terminology to describe a set of facts, assumptions, and relations with which the detailed meanings of vocabularies among communities can be determined. This is a qualitative content analysis research. This study has made use of ontology for the first time to discover the knowledge of vaccine in Ir...


Knowledge discovery from patients’ behavior via clustering-classification algorithms based on weighted eRFM and CLV model: An empirical study in public health care services

The rapid growing of information technology (IT) motivates and makes competitive advantages in health care industry. Nowadays, many hospitals try to build a successful customer relationship management (CRM) to recognize target and potential patients, increase patient loyalty and satisfaction and finally maximize their profitability. Many hospitals have large data warehouses containing customer ...


Development of Students’ Creativity through Learning Models in Physical Education during the Covid-19 Pandemic

Background. Physical education learning in the era of the COVID-19 pandemic has a remarkable impact on students’ creativity. Objectives. This study aims to determine the effect of applying the inquiry and discovery models in online physical education learning to develop high school students’ creativity. Methods. The multiple treatment and control with the pre and post-test procedure were used...


Logic of knowledge and discovery via interacting agents - Decision algorithm for true and satisfiable statements

In this paper we study logical properties of the operation chance discovery (CD) via structures based on special Kripke/Hintikka models. These models use as bases partially ordered sets of indexes (indexes of steps in a computation, or ones indicating time points in a time flow), and clusters of states associated to each index. The language chosen to build the logical formulas includes modal/te...


Journal:
  • Intell. Data Anal.

Volume 2

Publication date: 1998